2025-01-02 16:51:17.AIbase.14.4k
Google DeepMind Launches New Framework InfAlign: Enhancing Alignment Capability of Language Model Inference
Generative language models face many challenges during the transition from training to practical applications. One major issue is how to achieve optimal performance during the inference stage. Current strategies, such as Reinforcement Learning from Human Feedback (RLHF), primarily focus on improving the model's win rate but often overlook decoding strategies during inference, such as Best-of-N sampling and controlled decoding. This gap between training objectives and actual usage may lead to inefficiencies and affect the quality and reliability of outputs. To address these issues, Google D